Efficient Inferencing of Compressed Deep Neural Networks

نویسندگان

  • Dharma Teja Vooturi
  • Saurabh Goyal
  • Anamitra R. Choudhury
  • Yogish Sabharwal
  • Ashish Verma
چکیده

Large number of weights in deep neural networks makes the models difficult to be deployed in low memory environments such as, mobile phones, IOT edge devices as well as “inferencing as a service” environments on cloud. Prior work has considered reduction in the size of the models, through compression techniques like pruning, quantization, Huffman encoding etc. However, efficient inferencing using the compressed models has received little attention, specially with the Huffman encoding in place. In this paper, we propose efficient parallel algorithms for inferencing of single image and batches, under various memory constraints. Our experimental results show that our approach of using variable batch size for inferencing achieves 15-25% performance improvement in the inference throughput for AlexNet, while maintaining memory and latency constraints.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Diagnosis of Brucellosis in Rafsanjan City Using Deep Auto-Encoder Neural Networks

Introduction: Brucellosis is considered as one of the most important common infectious diseases between humans and animals. Considering the endemic nature of brucellosis and the existence of numerous reports of human and animal cases of brucellosis in Iran, the incidence of human brucellosis in Rafsanjan city was determined in the last 3 years (2016–2018). The main objective of this study was t...

متن کامل

The Diagnosis of Brucellosis in Rafsanjan City Using Deep Auto-Encoder Neural Networks

Introduction: Brucellosis is considered as one of the most important common infectious diseases between humans and animals. Considering the endemic nature of brucellosis and the existence of numerous reports of human and animal cases of brucellosis in Iran, the incidence of human brucellosis in Rafsanjan city was determined in the last 3 years (2016–2018). The main objective of this study was t...

متن کامل

Estimation of Hand Skeletal Postures by Using Deep Convolutional Neural Networks

Hand posture estimation attracts researchers because of its many applications. Hand posture recognition systems simulate the hand postures by using mathematical algorithms. Convolutional neural networks have provided the best results in the hand posture recognition so far. In this paper, we propose a new method to estimate the hand skeletal posture by using deep convolutional neural networks. T...

متن کامل

Towards Image Understanding from Deep Compression without Decoding

Motivated by recent work on deep neural network (DNN)-based image compression methods showing potential improvements in image quality, savings in storage, and bandwidth reduction, we propose to perform image understanding tasks such as classification and segmentation directly on the compressed representations produced by these compression methods. Since the encoders and decoders in DNN-based co...

متن کامل

A multi-scale convolutional neural network for automatic cloud and cloud shadow detection from Gaofen-1 images

The reconstruction of the information contaminated by cloud and cloud shadow is an important step in pre-processing of high-resolution satellite images. The cloud and cloud shadow automatic segmentation could be the first step in the process of reconstructing the information contaminated by cloud and cloud shadow. This stage is a remarkable challenge due to the relatively inefficient performanc...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1711.00244  شماره 

صفحات  -

تاریخ انتشار 2017